-
Notifications
You must be signed in to change notification settings - Fork 2.1k
[Internal]Enabled Netty Buffer Leak detection and left-over CosmsoClient instance detection during test execution #47211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…mos/encryption/CosmosNettyLeakDetectorFactory.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…to users/fabianm/TestImprovements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds comprehensive leak detection infrastructure for the Azure Cosmos SDK, enabling detection of both Cosmos client leaks and Netty resource leaks during testing. The implementation includes a new stack trace utility, client lifecycle tracking, custom Netty leak detection, and integration with TestNG test infrastructure.
Key Changes:
- Added
StackTraceUtilfor capturing and formatting stack traces to aid in leak diagnosis - Implemented client leak tracking in
RxDocumentClientImplusing a thread-safe map synchronized withstaticLock - Created custom
CosmosNettyLeakDetectorFactoryto detect Netty ByteBuf leaks with detailed logging
Reviewed Changes
Copilot reviewed 44 out of 44 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
StackTraceUtil.java |
New utility class for capturing and formatting stack traces with comprehensive JavaDoc |
RxDocumentClientImpl.java |
Added client lifecycle tracking with activeClients map and getActiveClientsSnapshot() API |
Configs.java |
Added CLIENT_LEAK_DETECTION_ENABLED configuration with temporary default value requiring revert |
CosmosNettyLeakDetectorFactory.java (tests and encryption) |
Custom Netty leak detector with TestNG listener integration for paranoid-level leak detection |
TestSuiteBase.java (rx, implementation, encryption) |
Added @BeforeClass/@AfterClass hooks for leak detection with snapshot comparison |
CosmosDiagnosticsE2ETest.java |
Added proper client cleanup in try-finally blocks |
Multiple *-testng.xml files |
Registered CosmosNettyLeakDetectorFactory as TestNG listener across all test suites |
KafkaCosmosTestSuiteBase.java |
Changed @BeforeSuite/@AfterSuite methods from static to instance |
pom.xml (kafka-connect) |
Added Maven Central repository entry |
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/Configs.java
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentClientImpl.java
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentClientImpl.java
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/rx/TestSuiteBase.java
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/TestSuiteBase.java
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/TestSuiteBase.java
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos-encryption/src/test/java/com/azure/cosmos/encryption/TestSuiteBase.java
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/CosmosDiagnosticsE2ETest.java
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/CosmosAsyncClientTest.java
Show resolved
Hide resolved
…to users/fabianm/NettyBufferLeakFixes
|
/azp run java - cosmos - tests |
|
/azp run java - cosmos - kafka |
|
/azp run java - cosmos - spark |
|
Azure Pipelines successfully started running 1 pipeline(s). |
2 similar comments
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Azure Pipelines successfully started running 1 pipeline(s). |
xinlian12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for this great effort to improve the test pipelines :)
.../main/java/com/azure/cosmos/implementation/directconnectivity/rntbd/RntbdRequestDecoder.java
Outdated
Show resolved
Hide resolved
…ntation/directconnectivity/rntbd/RntbdRequestDecoder.java Co-authored-by: Annie Liang <64233642+xinlian12@users.noreply.github.com>
|
/azp run java - cosmos - tests |
|
/azp run java - cosmos - kafka |
|
/azp run java - cosmos - spark |
|
Azure Pipelines successfully started running 1 pipeline(s). |
2 similar comments
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Azure Pipelines successfully started running 1 pipeline(s). |
xinlian12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
|
/check-enforcer override |
Description
The main purpose of this test is to hook up leak detection for Netty buffers and left-over CosmosClient instances in tests. The reason for it is to get better ocnfidence that there are no memory leaks.
The core logic is in
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/CosmosNettyLeakDetectorFactory.java- it automatically captures snapshots of already existing cleints and hooks up netty leak detection before/after tests for a test class are executed. This way any leaks are associated with at least the corresponding test class - for few tests whre I saw flakiness or I had to debug I also added netty buffer leak detection per method (but that comes with some overhead ebcause system.gc() is needed after each method with appropriate delay for GC to finish.Product code changes are gated on netty buffer leak detction being paranoid - so, should not cause any overhead.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines